Abstract


In this thesis we propose a new technique for the design and development
of an automatic visual identification system. The proposed system is
implemented as an interconnection of four subsystems: (i) sensing, (ii)
data acquisition, (iii) feature (content) extraction and (iv) feature analysis.
The system identifies images by content-based matching of a query image
against the database images. The query image is the image grabbed on-line
by the area-scan CCD cameras. The database is prepared off-line by
extracting relevant features for all the expected query images.

The gray-level image histogram is used for feature extraction. The
computational complexity and storage requirements are reduced by decomposing
the histogram using the wavelet transform. The first and second moments of
the wavelet coefficients are used as features. The root mean square (rms)
metric is used to compute the distance between the query image and each of
the database images.

Although the system is designed to inspect steel slabs for surface defects,
laboratory evaluation shows that it also gives excellent performance with
general textured and non-textured images. A setup of high-speed digital
signal processors is used to keep up with the required real-time throughput
rate of 1024 Kpixels/sec (a 1 m wide steel slab moving at the rate of 1 m/sec).
Chapter 1


Introduction 


Quality control is an important aspect of today's highly competitive
industrial production. One important way to improve product quality is
to inspect the product at each stage of the production cycle. Manual
inspection is difficult, time consuming and costly, and its effectiveness
may suffer because of the hazardous industrial environment. The heavy
dependence of the inspection process on human experience and expertise,
and the need to obtain performance beyond the limits of human ability,
have prompted the development of intelligent programmable vision-based
systems for inspection. The economic motivation for the use of computer
vision is to increase productivity.

Thus capital formation is often linked to technical innovation, which
can produce higher productivity. It is our basic postulate that an automatic
inspection system will raise both labor and general productivity. The
limitations and difficulties of developing an automatic web material^
inspection system are summarized in the next section.

^The term web material refers to materials produced in the form of
continuous rolls. Web processing is used in many segments of industry,
e.g. metals, paper, plastics and textiles.
1.1 Issues In Web Material Inspection


Web materials take many forms; however, there is remarkable similarity in
the requirements for web inspection technology, which cut across the major
industrial segments. Typically, materials are homogeneous and discrepancies
from homogeneity are interpreted as flaws. Web material inspection primarily
relies on two-dimensional image understanding, in contrast to parts
inspection, which is inherently a three-dimensional image understanding
problem.

Based on their generic characteristics, web materials can be categorized
into uniform and textured materials. In both categories, inspection is
currently performed either subjectively (by visual inspection) or
objectively (by destructive or non-destructive testing). Examples of uniform
materials include metals, films, paper and various plastics. Identification
and classification of defects in these materials is currently the highest
technical priority. Textured web materials can be divided into regular
textures (e.g. printed textiles, printed currency) and random textures
(e.g. non-woven materials). In regular textures any discrepancy from the
predetermined pattern is viewed as a defect. Frequently, these problems
require color processing, with the objective of evaluating color uniformity
and consistency, which is essential in many materials, e.g. wallpaper.
Random texture materials require grading of the overall and/or local texture
quality. In some cases the quality of these materials is evaluated through
performance tests, e.g. by measuring the pressure drop across the graded
surface. In this thesis we discuss surface defect identification for
uniform materials, namely flat cold rolled steel sheets.

Several phenomena are responsible for the distribution of light scattered
from a metallic surface. The most important effect is the variation of the
surface normal. This variation generally occurs on two scales: a fine-scale
variation representing the basic surface roughness, and a rough-scale
waviness resulting from surface irregularities of greater spacing than the
fine-scale variation. Variation in surface roughness results in variation
of the surface normal, hence the reflected power varies. Fig. 1 shows the
scattering geometry of a metal surface. Let $E_0^2$ be proportional to the
power scattered by a smooth plane of area $A$; then it is given by [6],


$$E_0^2 = \frac{k^2 A^2 l^2}{\pi^2 r_0^2} \tag{1.1}$$

where $k = 2\pi/\lambda$,

$\lambda$ is the wavelength of illumination,

$l$ is the cosine of the incident angle $i$,

$r_0$ is the distance from the observer to the plane.

Then the reflected power from a rough surface is given by

$$P = \langle \rho \rho^{*} \rangle \tag{1.2}$$

where $\rho$ is the reflection coefficient for scattered power.

At present, in most of the steel industry, the hot slab coming out of a
caster is cooled to ambient temperature. An inspector then manually inspects
the slab in order to detect surface imperfections. This manual inspection
takes a lot of time and slows down the overall production rate. Once the slab
has been determined to be free of imperfections, it has to be reheated
for further processing. If an automatic inspection system is put in place
of the manual inspection process, it will not only increase the production
rate but also avoid the intermediate cooling and heating that manual
inspection requires, thus saving a lot of energy.
1.1.1 Issues in uniform web material inspection

At present there are commercially available systems which can detect the
presence or absence of a defect at very reasonable cost. However, the problem
of defect identification, i.e. the determination of the type of defect, is
still an open research issue. The major obstacles in solving the
identification problem are the following:

• High data throughput: A typical web material is 1-3 m wide and
moves at a speed of 1-10 m/sec. Consequently, the data throughput for
100% inspection (when identifying defects of mm size) is tremendous
and cannot be handled by present general-purpose hardware.

• Inter-class similarity and intra-class diversity: A single class of
defects may vary widely in appearance and may have members that
closely resemble defects in other classes. Therefore, the structure of a
given class in feature space may be of a very complex nature.

• Large number of classes: A typical defect classification problem
involves a large number of defect classes; it is not unusual to deal with
a few dozen classes.

• Non-availability of adequate imperfection imagery: Another
very significant problem encountered in the development of an inspection
system is the non-availability of adequate imperfection imagery for feature
extraction and system training. The collection of imagery data is
hampered by the hazardous environment in the steel industry. This problem
is made more acute by the fact that the large number of imperfection
classes requires a large training set of imperfect samples.

• Dynamic defect populations: Small changes in the production process
can result in entirely new classes of defects, so a useful classification
system should be dynamic, with the ability for continuous on-line
learning.
The first four items make initial system design very difficult, whereas
the fifth item is a major obstacle in extending the useful lifespan of a
developed system.

A past study of the feasibility of on-line inspection of flat rolled steel
has been reported by Saridis and Brandin [7]. A recent review of automatic
inspection as applied to industrial inspection in general is given in [8].

1.2 Existing Techniques of Defect Identification: A Brief Survey

1.2.1 Edge Detection 

In this technique of defect identification, the edges of the on-line images
are detected. Edges that are due to noise are discarded by a filtering
operation. After detection of the relevant edges, a thresholding operation
is performed on the image. The presence of edges in the image indicates the
presence of a defect. To identify the type of defect, i.e. for defect
characterization, structural features (shape and size of blobs) of the
on-line image are extracted and matched with those of the database images.
The database holds information about the structural features of almost all
known defect classes.

The main problem with this technique of defect identification is that
it requires extensive learning of the geometry of different defects over an
extended period. Moreover, the parameters that affect the manufacturing
process are very sensitive to the industry setup and technical resources,
which results in very high intra-class diversity and inter-class similarity.
1.2.2 Profile Matching


In this method of defect analysis, the profiles of the on-line images are
calculated. It is assumed that the profile of a non-defective sheet is almost
smooth and constant, whereas the profile of a defective sheet is zig-zag.
The nature of the profile differs between defect classes.

Determination of metal surface profile: To calculate the metal surface
profile, the reflective power of the metal surface is measured indirectly;
the unit of reflective power is the "gray level" (0-255) of the digitized
image. To make things manageable, it is assumed that the brightest point in
the image has its surface normal located on the axis of the incident light.
Based on this assumption, a numerical relationship between the digitized
image and the incident angle can be calculated. From the incident angle the
surface profile can be calculated [6].


This technique is very efficient for defect detection but its performance 
is very poor in defect characterization. 

1.2.3 Smart Sensing and Cortical Projection 

In this technique of defect classification, smart sensors are used for image
acquisition rather than conventional ones. By conventional we refer to
sensors with uniform spatial resolution and a matrix organization of sensing
elements. In this approach, the sensing strategy is determined by the defect
detection subsystem. As in a target tracking system, the inspection
system first acquires a broad view (low-resolution image) and from it
determines the presence of potential defects. The presence of a potential
defect alerts the sensing subsystem through a feedback loop. The
space-variant sensor then foveates on the region specified by the defect
detection subsystem to provide a detailed view of the potential defect. The
switching of sensing strategy is motivated by the desire to reduce data
throughput. This sensor organization is particularly suitable for obtaining
a cortical projection [13] of an image, i.e. a geometrical transformation
associated with the human visual system. Cortical projection simplifies the
effects of rotation and scaling of an object in the image plane.

1.2.4 Content Based Technique 

Content-based imaging techniques are becoming increasingly popular in the
field of data management and classification because of their ability to
reduce data dimensionality. In a content-based technique, significant
features of the image under consideration are extracted. These features are
selected so that they form a good discrimination boundary between different
classes of images and can be extracted with ease.

In this thesis we use a content-based imaging technique for surface
defect identification of flat cold rolled steel sheets. Images of the sheets
are grabbed on-line in synchronism with the speed of the sheet. The intensity
histogram is used for feature extraction. Since the similarity metrics used
in histogram comparison are of quadratic form, they are computationally very
expensive. This computational complexity is the major bottleneck in the
design and development of a real-time inspection system. To reduce the
computational complexity and storage requirements, the histogram is
represented at varying resolutions using the wavelet transform. The feature
vector is formed by taking the first and second moments of the wavelet
coefficients at all decomposition levels. The database is created off-line
by extracting the features and taking their average values. Finally, the
root mean square distance is used as the similarity measure. To keep up with
the required data throughput, a setup of high-speed parallel DSP processors
is used.
1.3 Thesis Organization


The thesis is organized in the following manner. In chapter 2, we present
the feature selection criteria and the details of feature extraction. An
overview of the inspection system and details of the high-speed processor
setup architecture are presented in chapter 3. Chapter 4 presents the
results obtained using the proposed system both for steel sheet images and
for general textured images. Conclusions and the scope for future work are
discussed in chapter 5.
Chapter 2


Feature Selection and 
Extraction 


Any object or pattern which can be identified possesses a number of
discriminatory properties or features. The first step in any identification
or classification process, performed either by a machine or by a human
being, is to select the discriminatory features, followed by a method to
extract (measure) them. It is evident that the number of features needed to
successfully perform a given identification task depends on the
discriminatory qualities of the chosen features. However, the problem of
feature selection is complicated by the fact that the most important
features are not easily measurable or, in many cases, their measurement is
inhibited by economic considerations.

Feature selection and extraction play a central role in image
identification. In fact, the selection of an appropriate set of features,
which takes into account the difficulties present in the extraction process
and at the same time results in acceptable performance, is one of the most
difficult tasks in the design of an identification system. Broadly, features
can be classified into three categories: (i) physical features, (ii)
structural features, and (iii) mathematical features. Physical and
structural features are commonly used by human beings because these features
are easily detected by touch, by the eye, or by other sensory organs.
However, when machines are designed to identify objects, the effectiveness
of these features may be sharply reduced, since the capabilities of human
sensory organs are difficult to imitate in most practical situations. On the
other hand, machines can be designed to extract mathematical features of the
object of interest which a human may have some difficulty in determining
without mechanical aid. Examples of these types of features are statistical
means, variances, eigenvalues, eigenvectors and other invariant properties.

In the content-based imaging technique the significant features of images
are extracted and used to discriminate between the different classes of
images. Details of feature selection are given in section 2.1. The feature
extraction technique is explained in detail in section 2.2. Section 2.3
explains the similarity technique used in identification.


2.1 Feature Selection for Surface Defect Identification

Our aim is to design a real-time inspection system, i.e. the system should
be able to process and identify the present frame before the arrival of the
next frame. Although system throughput can be improved by using a dedicated
high-speed parallel processor setup, inter-processor communication overload
and resource limitations limit the system performance, and in practice the
performance may fail to improve as the number of processors is increased.
As the number of processors increases, the required resources also increase,
and economic constraints may prevent these demands from being met. The
amount of parallelism that can be achieved depends not only on the hardware
parallelism but also on the capacity of the algorithm to run in parallel.
Specifically, the following points should be kept in mind while selecting
features for surface defect identification:

• A steel sheet moving on a conveyer belt may experience some amount of
transverse movement, which can result in translation or rotation of the
image in the image plane. Feature vectors should be invariant to these
effects.

• Feature vectors should be such that they do not overload the feature
extraction process.

Intensity information is an important attribute of image representation.
The intensity distribution of an image is generally represented by its
histogram, which has been found to be a significant feature of an image. As
the histogram is invariant to translation, rotation and viewing axis, it is
an ideal choice of feature for the on-line identification problem. Another
advantage of the histogram is that it converts a two-dimensional image
processing problem into a one-dimensional distribution problem, i.e. it
eases the computational complexity.
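
The invariance property can be checked on a toy example. The following
Python sketch is illustrative only and is not part of the inspection
system: the 3 x 4 image, the 8-level gray scale and all helper names are
assumptions, and a cyclic column shift stands in for sheet translation.

```python
def gray_histogram(image, bins=256):
    """Count how many pixels fall on each gray level 0..bins-1."""
    hist = [0] * bins
    for row in image:
        for pixel in row:
            hist[pixel] += 1
    return hist


def rotate_90(image):
    """Rotate an image (list of rows) by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def shift_columns(image, k):
    """Cyclically shift every row by k pixels (a crude model of translation)."""
    return [row[-k:] + row[:-k] for row in image]


# Hypothetical 3 x 4 test image with gray levels in 0..7.
image = [
    [0, 1, 1, 7],
    [2, 2, 1, 7],
    [0, 5, 5, 7],
]

reference = gray_histogram(image, bins=8)
rotated = gray_histogram(rotate_90(image), bins=8)
shifted = gray_histogram(shift_columns(image, 1), bins=8)

print(reference == rotated == shifted)
```

Both transformed images give the same histogram as the original, because a
histogram only counts pixels per gray level and ignores where they lie.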

2.1.1 Histogram as Feature Vector 

The histogram of an image $f_n$ is an $N$-dimensional vector
$\{H(f_n, i),\ i = 1, 2, 3, \ldots, N\}$, where $N$ is the number of bins and
$H(f_n, i)$ is the number of pixels having intensity $i$. Given a pair of
images $f_n$ and $f_m$, the similarity between the two images may be measured
using the normalized intersection [10] of their histograms, given by

$$\frac{\sum_{i=1}^{N} \min\{H(f_n, i),\, H(f_m, i)\}}{\sum_{i=1}^{N} H(f_m, i)} \tag{2.1}$$

In this metric, the intersection measure is incremented by the number of
pixels which are common between the database image and the on-line
(query) image. The measure is finally divided by the total number of pixels
in the query image as a normalization factor. It has been shown [10] that
this metric is fairly insensitive to image resolution, histogram size,
occlusion, depth and viewpoint. However, histogram intersection does not
take into account the perceptual similarity between the different bins of
the histogram. A metric [12] which takes into account the similarity between
the bins is defined in Eq. 2.2.


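$$\mathrm{Dist}(f_n, f_m) = \sum_{i=1}^{N} \sum_{j=1}^{N} a_{ij}\,\bigl(H(f_n, i) - H(f_m, i)\bigr)\bigl(H(f_n, j) - H(f_m, j)\bigr) \tag{2.2}$$

where the weight $a_{ij}$ is the cross correlation between bins $i$ and $j$
of the histogram. The weights can be normalized so that $0 \le a_{ij} \le 1$,
with $a_{ii} = 1$, a large $a_{ij}$ denoting similarity between bins $i$ and
$j$ and a small $a_{ij}$ dissimilarity. The two distributions $H(f_n)$ and
$H(f_m)$ can also be normalized so that $0 \le H(f_n, i), H(f_m, i) \le 1$ and
$\sum_i H(f_n, i) = \sum_i H(f_m, i) = 1$.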
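
The difference between the two measures can be illustrated with a short
Python sketch. It is a minimal illustration under stated assumptions, not
the implementation used in this work: the function names are hypothetical,
the quadratic distance is taken to be the usual double sum of weighted
products of bin differences, and the weights $1 - |i - j|/N$ are an assumed
stand-in for the bin cross correlations of [12].

```python
def normalized_intersection(h_db, h_query):
    """Sum of bin-wise minima, divided by the pixel count of the query."""
    common = sum(min(a, b) for a, b in zip(h_db, h_query))
    return common / sum(h_query)


def quadratic_distance(h_n, h_m, weights):
    """Double sum of weights[i][j] * d[i] * d[j], with d = h_n - h_m."""
    d = [a - b for a, b in zip(h_n, h_m)]
    n = len(d)
    return sum(weights[i][j] * d[i] * d[j] for i in range(n) for j in range(n))


def example_weights(n):
    """Assumed bin similarity: 1 on the diagonal, decaying with |i - j|."""
    return [[1.0 - abs(i - j) / n for j in range(n)] for i in range(n)]


# Three hypothetical normalized 4-bin histograms, each with a single peak.
query = [1.0, 0.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0, 0.0]   # peak in the adjacent bin
far = [0.0, 0.0, 0.0, 1.0]    # peak in the most distant bin

a = example_weights(4)
print(normalized_intersection(near, query), normalized_intersection(far, query))
print(quadratic_distance(query, near, a), quadratic_distance(query, far, a))
```

Intersection scores both candidates as 0.0 because no bins overlap, whereas
the quadratic form gives 0.5 for the neighbouring peak and 1.5 for the
distant one, which is the bin-similarity effect described above. The double
loop also makes the $N^2$ cost per comparison visible.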

The distance, or more precisely the pseudo-metric, shown in Eq. 2.2 is
of quadratic form, and since the histogram is also a high-dimensional
($N = 256$) distribution, this distance measure is computationally very
expensive. For an image of size $X \times Y$, building the histogram requires
$O(XY)$ additions and $O(XY)$ increments. In addition, $O(N^2)$ operations
are required to compare a pair of histograms, though this can be improved to
$O(N)$ by some pre-computation (e.g. by diagonalizing the quadratic form).
Moreover, because of the large number of defect classes, it is generally not
feasible to compute the match measure against every image ($O(M)$
computations if $M$ is the size of the database). Thus, while the histogram
has proven to be a good feature, the computational complexity becomes an
inhibiting factor as the size of the database increases. Hence, the problem
at hand is to define a considerably less expensive measure on considerably
lower-dimensional features so that:

• Overall computational complexity can be reduced in order for the sys- 
tem to work in real time. 

• The database can be organized in terms of the lower dimension feature 
vectors, thereby reducing the storage requirements. 
To reduce the computational complexity and storage requirements, the
histogram is decomposed using the wavelet transform. In the next section we
describe the multiresolution decomposition of the histogram.


2.2 Multiresolution Representation of the Histogram


Multiresolution analysis is widely appreciated in the field of computer
vision and image processing because of its ability to decompose and analyze
a signal or image at varying resolutions, which is very similar to the human
visual system. The histogram of an image is a discrete one-dimensional
distribution of intensity in gray-level space. Multiresolution analysis can
be used to analyze this distribution at various resolutions. Some heuristic
reasons to try out histogram representation at different resolutions are
given below:

• Typically, an image histogram is sparse, with the pixel intensities being
concentrated in a few regions of intensity space. This is true even
for histograms where the intensity axis is coarsely quantized. Even
histograms with 256 bins have many bins with very few or no pixels.
Consequently, treating the histogram as a simple N-dimensional vector is
not an efficient representation, especially since the high dimensionality
has been the main problem with the computational complexity of histogram
comparison techniques. In addition, there is definitely a certain
degree of correlation among the adjacent bins of a histogram in terms
of pixel count. Thus the sparse nature of the histogram, together with
adjacent-bin correlation, will lead to many coefficients having small or
null values in a multiresolution decomposition scheme.

• Multiresolution representation decomposes the signal or distribution into
two different components. The coarse component gives information about
the pixel concentration in gray-level space, whereas contrast information
of an image can be extracted from the detail component.

• Multiresolution representation emphasizes different characteristics of a
signal or its distribution at different levels. For example, the coarser
resolution can give an idea of the relative amount of intensities in
different sections of a very roughly divided gray-level space, while the
indices of the coefficients with the largest magnitudes at the lower levels
can more precisely point to the "edges" or to the regions of pixel value
concentration in the gray-level space.

• One of the advantages of using color histogram is its low computa- 
tional complexity in the feature extraction process. Since multiresolu- 
tion decomposition is fast and easy to compute, this advantage is not 
compromised. 

2.2.1 The Wavelet Representation 

In a multiresolution technique, the signal is approximated at various
resolutions. Let $A_{2^j}$ denote an operator which approximates a signal at
resolution $2^j$. The difference of information between the signal
approximation at resolution $2^j$ and the approximation at resolution
$2^{j+1}$ can be extracted by decomposing the signal or function in a wavelet
orthonormal basis. This decomposition defines a complete and orthogonal
multiresolution representation called the "wavelet representation". There
exists a function (wavelet) $\psi(x)$ such that
$\{\sqrt{2^{j}}\,\psi(2^{j}x - k);\ (j, k) \in \mathbb{Z}^2\}$ is an
orthonormal basis of $L^2(\mathbb{R})$ (the vector space of measurable,
square-integrable one-dimensional functions $f(x)$). The Haar basis is one
realization of such a basis. These basis vectors are localized in both the
time and the frequency plane. This localization property of wavelets makes
it convenient to analyze the low-frequency content of a signal with long
basis functions and the high-frequency content with short basis functions.
Properties of the approximation operator are given in Appendix A.1.

A block diagram of the one-dimensional wavelet transform is given in
Fig. 2.1.

Figure 2.1: Block Diagram of Wavelet Transform

In this figure, filters G and H are low-pass and high-pass filters
respectively. The coefficients of these filters depend upon the type of
wavelet selected for decomposition. The outputs of the filters are sent to
dyadic down-samplers. The detail coefficients of the first level are stored
for extraction of significant features, whereas the coarse component is
applied to the next-level decomposer for further decomposition. This
procedure is repeated until the last level of decomposition. The fast
discrete one-dimensional wavelet transform implementation is explained in
Appendix A.2.
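
The filter-and-downsample cascade can be sketched in a few lines of Python.
This is only an illustration under assumptions: the two-tap Haar filters are
an assumed choice of wavelet, the function names are hypothetical, the input
length is assumed to be divisible by two at every level, and the fast
implementation of Appendix A.2 is not reproduced here.

```python
import math

# Two-tap Haar analysis filters (an assumed choice of wavelet).
LOW_PASS = (1 / math.sqrt(2), 1 / math.sqrt(2))    # filter G in Fig. 2.1
HIGH_PASS = (1 / math.sqrt(2), -1 / math.sqrt(2))  # filter H in Fig. 2.1


def analyze_one_level(signal):
    """Filter with G and H and keep one output per pair of input samples."""
    coarse, detail = [], []
    for k in range(0, len(signal) - 1, 2):
        a, b = signal[k], signal[k + 1]
        coarse.append(LOW_PASS[0] * a + LOW_PASS[1] * b)
        detail.append(HIGH_PASS[0] * a + HIGH_PASS[1] * b)
    return coarse, detail


def decompose(signal, levels):
    """Return (list of detail coefficients per level, final coarse part)."""
    details = []
    coarse = list(signal)
    for _ in range(levels):
        coarse, detail = analyze_one_level(coarse)
        details.append(detail)   # stored for feature extraction
    return details, coarse       # coarse was fed to each next level


# Hypothetical 256-bin histogram: a ramp, used only to exercise the code.
histogram = list(range(256))
details, coarse = decompose(histogram, levels=3)
print([len(d) for d in details], len(coarse))
```

For a 256-bin input decomposed to three levels, the sketch stores 128, 64
and 32 detail coefficients and a 32-sample coarse component. Because the
Haar filters are orthonormal, the total energy of the coefficients equals
that of the input.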


2.3 Feature Extraction 

Feature vectors of an on-line sheet image are extracted from the histogram
of that image. The histogram of the on-line image is applied as input to the
multiresolution decomposition filters. The filtered outputs are down-sampled
by two to get the coarse and detail outputs respectively. The detail
coefficients are stored as they are, and the coarse (approximation)
component of the first level is applied to the decomposer again for the next
level of decomposition. This procedure is repeated until the last level of
decomposition. The number of levels to which the signal should be decomposed
is application dependent, and the optimum level of decomposition can be
found experimentally. As a specific example, consider a histogram with
N = 256 bins being decomposed up to three levels: the numbers of
coefficients in the first, second and third levels of decomposition are 128,
64 and 32 respectively. If the filters are implemented as a convolution,
then the numbers of coefficients for a wavelet having two coefficients are
129, 65 and 33 respectively, i.e. the data is compressed as the level of
decomposition increases.

2.3.1 Feature Vector 

As is true in most fields which deal with measuring and interpreting
physical events, statistical considerations become important in image
classification because of the randomness of the conditions under which
images are grabbed. Statistical considerations are especially important in
industrial applications because the number of parameters responsible for
intra-class variation and inter-class similarity is very large and the
causes of variation in these parameters are not well defined. In this defect
identification problem, feature vectors are formed by calculating the first
and second moments (mathematical features), i.e. the mean and variance, of
the wavelet coefficients. Let $V_{hist}$ denote the variance of the
histogram, and let $M_{det1}$, $V_{det1}$, $M_{det2}$, $V_{det2}$,
$M_{det3}$, $V_{det3}$ and $M_{app3}$ denote the means and variances of the
first-, second- and third-level detail coefficients and the mean of the
approximation coefficients at level three. Then the feature vector is
given by Eq. 2.3.

$$\mathrm{FeatureVector} = \{V_{hist}, M_{det1}, V_{det1}, M_{det2}, V_{det2}, M_{det3}, V_{det3}, M_{app3}\} \tag{2.3}$$

These feature vectors are used for comparison of the on-line image with
those in the database. The database is prepared off-line by extracting the
features of probable defects in the same way.
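
Assembling the feature vector from the histogram and its wavelet
coefficients can be sketched as follows. The Python below is a hedged
illustration: the function names are hypothetical, the population variance
is an assumed reading of "second moment", and the coefficient values in the
usage example are toy numbers rather than the output of a real
decomposition.

```python
def mean(values):
    """First moment of a list of numbers."""
    return sum(values) / len(values)


def variance(values):
    """Second central moment (population variance, an assumed convention)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def feature_vector(histogram, details, approx):
    """Histogram variance, then (mean, variance) of the detail coefficients
    of each level, then the mean of the final approximation coefficients."""
    features = [variance(histogram)]
    for level in details:            # levels ordered first, second, third
        features.append(mean(level))
        features.append(variance(level))
    features.append(mean(approx))
    return features


# Toy inputs for a three-level decomposition (illustrative values only).
toy_histogram = [2, 4, 4, 2]
toy_details = [[1.0, -1.0, 1.0, -1.0], [2.0, 0.0], [3.0]]
toy_approx = [5.0]

features = feature_vector(toy_histogram, toy_details, toy_approx)
print(len(features), features)
```

With three decomposition levels the vector has eight members, so each image
is summarized by eight numbers instead of a 256-bin histogram.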
2.3.2 Similarity Metrics


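Our basic assumption in feature vector comparison is that two images
which are similar in nature or belong to the same class will be similar in
feature space, i.e. they will be separated by a very small distance in
feature space. Let $Q_W$ and $T_W$ be the sets of features of two different
images, where $W$ is the number of members in each feature vector:

$$Q_W = \{V^{Q}_{hist}, M^{Q}_{det1}, V^{Q}_{det1}, M^{Q}_{det2}, V^{Q}_{det2}, M^{Q}_{det3}, V^{Q}_{det3}, M^{Q}_{app3}\},$$

$$T_W = \{V^{T}_{hist}, M^{T}_{det1}, V^{T}_{det1}, M^{T}_{det2}, V^{T}_{det2}, M^{T}_{det3}, V^{T}_{det3}, M^{T}_{app3}\}.$$

The root mean square (rms) metric is used to compute the distance
between the query image and each of the database images. The rms distance
$\mathrm{Dist}(Q_W, T_W)$ is given by Eq. 2.4:

$$\mathrm{Dist}(Q_W, T_W) = \sqrt{\frac{1}{W} \sum_{i=1}^{W} \bigl(Q_W(i) - T_W(i)\bigr)^2} \tag{2.4}$$

where $Q_W(i)$ and $T_W(i)$ denote the $i$-th members of $Q_W$ and $T_W$.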
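
The matching step can be sketched in Python as below. This is a minimal
sketch under assumptions, not the implementation used in this work: the rms
distance is taken in its usual form (square root of the mean squared
difference of corresponding members), the nearest-entry rule is an assumed
decision rule, and the database entries are made-up two-member vectors used
only to keep the example short.

```python
import math


def rms_distance(q, t):
    """Root mean square difference between two equal-length feature vectors."""
    if len(q) != len(t):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, t)) / len(q))


def identify(query, database):
    """Return the label of the database entry closest to the query."""
    return min(database, key=lambda label: rms_distance(query, database[label]))


# Hypothetical database: defect label -> averaged feature vector.
database = {
    "coil break": [4.0, 1.0],
    "roll mark": [0.5, 3.0],
}
query = [3.5, 1.5]

print(identify(query, database), rms_distance(query, database["coil break"]))
```

Each comparison costs a number of operations proportional to the feature
vector length W, rather than to the square of the number of histogram bins.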
Chapter 3


Inspection System Overview 


The steel sheets produced in the steel industry are susceptible to a wide
variety of surface imperfections. A listing of some of the commonly
occurring defects is given in Table 3.1.


Table 3.1: Name of Defects

Coil Break
Roll Mark
Black Patch
Round Scratch
Linear Scratch
Pinch Mark
Color Annealing
Rust in Scale
Iron Particle
Surface Waviness


Our aim is to design a real-time visual inspection system for detection
and classification of surface defects. The proposed inspection system is
shown in Fig. 3.1. The sheet surface is illuminated with an artificial source
of light. The (video) camera views the sheet in a direction transverse to
the sheet motion. As the sheet moves along the conveyer belt, an analog
camera grabs the image arriving in its field of view at a rate which is in
synchronism with the movement of the sheet in the transverse direction. A
position camera records the position of the sheet relative to a preset
reference position (in some commercially available cameras, the position can
also be recorded together with image acquisition). The video signal (1 Vpp)
is routed to the data acquisition circuit via a BNC connector. A fast ADC is
used to digitize the image. The digitized images are sent to the
preprocessing circuit of the image grabber card (section 3.2) to remove
discrepancies (if any) due to non-uniform illumination. Finally the
preprocessed image is stored in the video RAM (VRAM) of the image grabber
card for further processing. Monitor 1 is required to view the on-line
identification report, whereas Monitor 2 is optional and required only when
the on-line sheet images are to be displayed. A brief description of each of
the submodules used in the proposed system is given in the following
subsections.


Figure 3.1: Block Diagram of Inspection System


This system will record the date, time and position of each defect, so
that these parameters may be taken into account at a later time, e.g. when
cutting material. A typical identification report is shown in Fig. 3.2.
Figure 3.2: Identification Report

DATE          12:01:1999
TIME          00:01:48
POSITION
DEFECT TYPE   COIL BREAK


3.1 System Description 

In order to illustrate the data rates expected in material inspection we
consider a specific case. Consider a 1 m wide sheet moving at a speed of
1 m/sec, where the system has to identify defects of 1 mm (in both the
horizontal and vertical directions). Also assume that a defect needs to be
represented by a minimum of 2 pixels in both the horizontal and vertical
directions (i.e. a spatial resolution of 0.5 mm/pixel). It is necessary to
place two Charge Coupled Device (CCD) cameras operating at 512 x 512 pixels
to cover the cross-sheet direction. In order to keep up with the moving
sheet, which travels at the rate of 1 m/sec, it is necessary to acquire
2 frames/sec. If the CCD camera operates at 1024 x 1024 pixels, then only
one camera is needed to cover the cross-sheet direction and only 1 (one)
frame/sec is required to keep up with the moving sheet. The defect detection
system then receives in total 2 x 2 x 512 x 512 = 1024 Kpixels/sec. The need
to keep up with this throughput has dictated our design philosophy for both
the algorithms and the hardware architecture. A block diagram of the
complete hardware setup is given in Fig. 3.3. A brief introduction to each
module of this system is presented in the following subsections.


Figure 3.3: Block Diagram of System Hardware 
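The data-rate arithmetic of Section 3.1 can be checked with a short
calculation. The Python sketch below is only illustrative; the function name
pixel_rate and its parameters are assumptions introduced here and are not
part of the system software.

```python
def pixel_rate(cameras, frames_per_sec, width, height):
    """Pixels per second delivered to the defect detection system."""
    return cameras * frames_per_sec * width * height


# Two 512 x 512 cameras, each grabbing 2 frames/sec.
two_small = pixel_rate(2, 2, 512, 512)
# One 1024 x 1024 camera grabbing 1 frame/sec.
one_large = pixel_rate(1, 1, 1024, 1024)

print(two_small // 1024, "Kpixels/sec")   # 1024 Kpixels/sec
print(one_large // 1024, "Kpixels/sec")   # 1024 Kpixels/sec
```

Both camera configurations lead to the same throughput requirement, which is
why the figure of 1024 Kpixels/sec is used as the design target.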



3.1.1 Frame Grabber Card 

As shown in Fig. 3.3, the frame grabber card mainly consists of five
interconnected submodules: (i) analog interface and anti-aliasing filter,
(ii) analog to digital converter (ADC), (iii) graphics signal processor
(GSP), (iv) digital signal processor (DSP), and (v) video RAM and dynamic
RAM. The video signal of the camera is routed to the ADC of the interface
circuit. An anti-aliasing filter is used to band-limit the signal. Two
software-selectable anti-aliasing filters are available: one whose corner
frequency is fixed at 20 MHz, and a second that is programmable to have a
corner frequency from 1 MHz to 10 MHz. Depending on the requirements, the
sampling frequency can be adjusted from 510 kHz to 40 MHz in increments of
less than 10 kHz. The digitized image is finally stored in video RAM for
display and further processing. If more than one camera is operating
simultaneously, the image frames are integrated before being stored in
VRAM.

The Graphics Signal Processor (GSP), a TMS34020, is a 40 MHz advanced
processor. Its unique array addressing support and efficient manipulation
of hardware-supported data types, such as pixels and 2-dimensional pixel
arrays, make it very useful in imaging and graphics applications. The GSP
architecture provides high performance in moving large blocks of data, data
management, display control and refresh, image processing, and host
communications. The TMS34020 is also responsible for all operations within
the graphics overlay plane. The Digital Signal Processor (DSP), a
TMS320C40 mounted on the frame grabber card, is a 40 MHz, 32-bit
floating-point processor and can be programmed for a specific application.

The frame buffer memory on the grabber card is composed of VRAM and
DRAM. The VRAM can be used for image display, and both VRAM and DRAM can
be used for acquisition and processing. The total memory available on the
grabber card is divided into logical partitions called frame buffers. These
logical frame buffers are the Acquisition Frame Buffer (FFB), the Display
Frame Buffer (DFB) and the Processing Frame Buffer (PFB). The frame buffer
dimensions are selected in the driver (Occulus) configuration program. The
sizes of the logical frame buffers are the same in VRAM and DRAM. The VRAM
frame buffer memory can be accessed by either the GSP or the DSP, and has
its own set of frame buffer pointers (FFB, DFB, PFB). The DRAM memory also
has its own set of frame buffer pointers and can be accessed by the DSP
only.





3.1.2 Parallel Processor Card 


The Parallel Processor Card (PPC) is specially designed to meet the
requirements of real time processing and embedded system development. It
has two very high speed programmable digital signal processors. These
processors can be programmed for digital signal processing algorithms as
well as for other processing algorithms. The PPC has local as well as
global memory. As shown in Fig. 3.3, each local memory can be accessed only
by the processor to which it is connected. The global memory is connected
to the ISA bus and can be accessed by the PC as well as by the processors.
Some of the key features of the TMS320C40 are as follows:

• 40-ns and 50-ns instruction cycle time 

• High precision and wide dynamic range of floating point (40/32-bits) 
unit. 

• 40/32-bit single-cycle floating point/integer multiplier for high perfor- 
mance in computationally-intensive algorithms. 

• Hardware divide and inverse square root support. 

• Single-cycle barrel shifter for 0-31 single-cycle right or left shifts for fast 
bit manipulation. 

• Separate internal program, data, and DMA coprocessor busses for 
support of massive concurrent I/O of program and data throughput, 
thereby maximizing sustained CPU performance. 

• Two identical external data and address busses supporting shared mem- 
ory systems and high data rate, single-cycle transfers. 

• On-chip program cache and dual access/single-cycle RAM for increased 
memory access performance. 

• Six communication ports for high-speed (20 Mbytes/sec asynchronous
transfer rate at each port) interprocessor communication.

• Six-channel DMA coprocessor for concurrent I/O and CPU operation,
thereby maximizing sustained CPU performance by relieving the CPU of
burdensome I/O.

• Single-cycle IEEE floating-point conversion for efficient interface
to IEEE-compatible processors.

3.2 Communication Protocol 

As shown in Fig. 3.3, the proposed system has to work in a multi-processor
(parallel processor) environment. Processor-to-processor communication is
critical in multiprocessor system design. Generally, in a parallel processor
system, an application is divided into a number of dependent and independent
tasks. Dependent tasks need to share the data of another task running
simultaneously on another processor. This section explains the
interprocessor communication protocols of all the processors involved in
the proposed system design.

3.2.1 Communication Protocol between Host and
GSP/DSP of Frame Grabber Card

In a parallel processing environment, task management is usually done by
the application developer. The main program runs on the Host and takes care
of system initialization, memory management, data acquisition,
synchronizing the start of the PPC program (which executes on the C40 DSP
processors), controlling the frame grabber and setting the processing
window size. The Host program also takes care of the user interface and of
data transfer to and from the secondary storage device. The data to be
processed are downloaded into the memory map of the processors. The frame
grabber card carrying the DSP/GSP processors is installed in an ISA bus
slot of the PC (Host). The state diagram shown in Fig. 3.4 explains the
communication protocol between the Host and the GSP/DSP. This diagram is
drawn assuming that only valid ODX commands are given; error handling is
not detailed here.


Figure 3.4: Host to GSP/DSP Communication Protocol

PROCESSOR 0: GSP (Graphics Signal Processor)
PROCESSOR 1: DSP (Digital Signal Processor)


Sequence of Operation for Standard GSP/DSP Firmware

• Host Occulus ODX commands and data are received by the GSP.

• If the GSP is the active processor, it executes the command.

• If the DSP is the active processor, the command is passed to the DSP,
provided the command is within the DSP ODX instruction set. Otherwise
the GSP executes the command without notification or error.
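The dispatch rule in the list above can be summarized in a few lines of
Python. This is a sketch under stated assumptions: the function
route_command, the string labels and the example command names are
hypothetical and are not taken from the ODX firmware.

```python
def route_command(command, active_processor, dsp_instruction_set):
    """Return the processor ("GSP" or "DSP") that executes an ODX command.

    Every command arrives at the GSP first. The GSP hands a command to the
    DSP only when the DSP is the active processor and the command belongs
    to the DSP instruction set; otherwise the GSP executes it itself.
    """
    if active_processor == "DSP" and command in dsp_instruction_set:
        return "DSP"
    return "GSP"


# Hypothetical command names, used only for illustration.
dsp_set = {"HISTOGRAM", "CONVOLVE"}
print(route_command("HISTOGRAM", "DSP", dsp_set))  # DSP
print(route_command("DRAW_LINE", "DSP", dsp_set))  # GSP
print(route_command("HISTOGRAM", "GSP", dsp_set))  # GSP
```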

Host-GSP communication state transition diagram 

Fig. 3.5 shows the communication state transition diagram of Host to GSP.
The transition states and the communication functions of the Host-GSP
communication are as follows:

Figure 3.5: Host to GSP Communication State Transition Diagram

• A host ODX command is sent to the GSP. The GSP goes to the receive
state, and the ODX command is received. If the command has data,
MSG_COPY() is used to transfer the data from host to GSP.

• MSG_WAIT() terminates the transfer from host to GSP (it is not re- 
quired if MSG_COPY is not used). 

• The GSP will process the command. Additional ODX commands can
be received while the GSP is in the process state if:

- The ODX command buffer is not full (16 commands deep).

- Neither the ODX command currently in progress nor any command in
the buffer needs to send a message or data back to the host.

• If there is data to be returned from a command process, MSG_COPY()
is used. The command is then terminated by MSG_SEND(), which returns
a command process value, for example MSG_SEND(0) for a command
executed without error.

• After all commands have been processed and any data or messages sent 
back to the host, the GSP returns to the IDLE state. 
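The acceptance rules and the return to the IDLE state described above can be
modelled as a small state machine. The Python sketch below is a toy model
written for illustration only: the class CommandReceiver, its method names
and the example command names are assumptions, the transient receive state is
folded into submit(), and the command in progress is assumed to occupy one of
the 16 buffer slots.

```python
from collections import deque

BUFFER_DEPTH = 16  # the text states a 16-command-deep buffer


class CommandReceiver:
    """Toy model of the IDLE / PROCESS behaviour of the GSP (or DSP)."""

    def __init__(self):
        self.state = "IDLE"
        # Head of the queue is the command in progress (an assumption).
        self.queue = deque()

    def can_accept(self):
        if len(self.queue) >= BUFFER_DEPTH:
            return False
        # No running or buffered command may need to reply to the sender.
        return not any(returns_data for _, returns_data in self.queue)

    def submit(self, name, returns_data=False):
        """Try to hand over a command; return True if it was accepted."""
        if not self.can_accept():
            return False
        self.queue.append((name, returns_data))
        self.state = "PROCESS"
        return True

    def finish_one(self):
        """Complete the command in progress; 0 stands for 'no error'."""
        self.queue.popleft()
        if not self.queue:
            self.state = "IDLE"
        return 0


rx = CommandReceiver()
rx.submit("SET_WINDOW")
rx.submit("GET_HISTOGRAM", returns_data=True)
print(rx.submit("GRAB"))   # False: a pending command must reply first
```

The same model applies to the GSP-DSP link, since that link follows the same
conventions with the GSP in the role of the sender.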

GSP-DSP communication State Transition Diagram 

The state transition diagram of GSP-DSP communication is given in Fig. 3.6.
GSP to DSP communication follows the same conventions as the PC host to GSP
communication sequence described above.

Figure 3.6: GSP-DSP Communication State Transition Diagram

• An ODX command is sent to the DSP. The DSP goes to the receive
state, and the ODX command is received. If the command has data,
MSG_COPY() is used to transfer the data from GSP to DSP.

• MSG_WAIT() terminates the transfer from GSP to DSP (it is not
required if MSG_COPY() is not used).

• The DSP will process the command. Additional ODX commands from the
GSP can be received while the DSP is in the process state if:

- The command buffer is not full (16 commands deep).

- Neither the command currently in progress nor any command in the
buffer needs to send a message or data back to the GSP.

• If there is data to be returned from a command process, MSG_COPY()
is used. The command is then terminated by MSG_SEND(), which
returns a command process value, for example MSG_SEND(0) for a
command executed without error.

• After all commands have been processed and any data or messages sent
back to the GSP, the DSP returns to the IDLE state.

3.2.2 Communication between Host and PPC 

The DSP processor on the frame grabber divides the grabbed image and
distributes the sub-images to the C40s on the PPC. The DSP processor on the
grabber card then receives the processed data from the C40s of the PPC
board. To start the processing on the PPC board, the Host is required to
initialize the C40s of the PPC, set/reset the flags and pass commands as
and when required. Communication can take place through shared memory,
where the Host directly writes to or reads from the global memory locations
of the PPC; through an interrupt scheme, where the PPC interrupts the Host
or vice versa; or through the status register of the PPC processor. The
shared memory scheme is an easy way of communicating. Communication using
shared memory is called "semaphoring". The Real Time Library (RTL) provides
an easy and convenient way of communicating with, and controlling, the PPC.
Therefore, the interface with the PPC is implemented using the RTL.

3.2.3 Communication between Frame Grabber DSP
and PPC DSP
Communication between the DSP of the frame grabber and the PPC DSP is done
through communication port connections by using the callable communication
port functions. These callable functions can be used with any DMA channel
and any communication port, in either transmit mode or receive mode. Since
we have to communicate in both directions, we have used the DMA coprocessor
in split mode. Split mode transforms one DMA channel into two DMA channels
to make two-way communication feasible. The Primary Channel is dedicated to
reading data from a location in the memory map (external/internal) and
writing it to the communication port output FIFO. The Auxiliary Channel is
dedicated to receiving data from a communication port input FIFO and
writing it to a location in the memory map. A DMA channel in split mode can
be used with any communication port; however, read/write synchronization is
restricted to signals from its own communication port. In other words,
DMA-i can synchronize only with signals coming from communication port i.
Appendix B.1 shows typical split mode operation with one communication
port.
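The two data paths of a split-mode channel can be pictured with a toy
software model. The Python sketch below is not TMS320C40 code: the class
SplitModeDMA, its attribute names and the auto-incrementing addresses are
assumptions chosen for illustration, and the single temp attribute stands in
for the temporary register inside the DMA coprocessor.

```python
from collections import deque


class SplitModeDMA:
    """Toy model of one DMA channel in split mode on one communication port."""

    def __init__(self, memory, src, dst):
        self.memory = memory       # list standing in for the memory map
        self.src = src             # source address used by the primary channel
        self.dst = dst             # destination address used by the auxiliary channel
        self.temp = None           # temporary register inside the DMA coprocessor
        self.out_fifo = deque()    # communication port output FIFO
        self.in_fifo = deque()     # communication port input FIFO

    def primary_transfer(self):
        """Memory -> temporary register -> output FIFO."""
        self.temp = self.memory[self.src]
        self.out_fifo.append(self.temp)
        self.src += 1              # assumed: address advances after each word

    def auxiliary_transfer(self):
        """Input FIFO -> temporary register -> memory."""
        self.temp = self.in_fifo.popleft()
        self.memory[self.dst] = self.temp
        self.dst += 1              # assumed: address advances after each word


mem = [10, 20, 0, 0]
dma = SplitModeDMA(mem, src=0, dst=2)
dma.primary_transfer()             # word 10 goes out on the port
dma.in_fifo.append(99)             # a word arrives from the other processor
dma.auxiliary_transfer()           # and is stored at address 2
print(list(dma.out_fifo), mem)     # [10] [10, 20, 99, 0]
```

Because both transfers pass through the same temporary register in this
model, one transfer has to finish before the other starts, which mirrors the
arbitration constraint noted in Appendix B.1.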

The split mode communication protocol is given below:

• The primary channel reads a word from the address pointed to by the
source address register and writes it to a temporary register within
the DMA coprocessor. It then writes the temporary register value to the
output FIFO of the communication port specified in the COM PORT
field.

• The auxiliary channel reads a word from the input FIFO of the
communication port specified in the COM PORT field and writes it to a
temporary register within the DMA coprocessor. It then writes the
temporary register value to the address pointed to by the destination
address register.


Chapter 4


Implementation and 
Laboratory Evaluation 


In this chapter we discuss the implementation of the proposed identification
system using a TMS320C40 parallel processor setup. The developed system
is evaluated using different types of textured and non-textured images.


4.1 Implementation Techniques 

The proposed identification system mainly consists of two modules: the
hardware setup and the software module. It is very important in software
design and implementation to use the available resources to their full
capacity. The algorithm discussed in Chapter 2 is used for implementation.
The implementation technique depends on the parallel processor
configuration, on the type of processor used and on the resources
available. To finalize the implementation scheme for the available hardware
resources, we measured the identification time required by the system. The
shorter the identification time, the better the implementation technique.

Although we implemented and tested the overall algorithm module by
module, implementing the modules in various ways, in this thesis we
consider two slightly different implementation techniques. Fig. 4.1 and
Fig. 4.2 show the flow charts of implementation schemes 1 and 2
respectively. As shown in Fig. 4.1, the wavelet transform is implemented on
all three processors by dividing the histogram vector into three blocks,
and the features are extracted on the master processor. In Fig. 4.2 the
wavelet transform is implemented on only the two slave processors
(TMS320C40), and the coefficients of decomposition are sent in parallel to
the master processor for feature extraction. The data decimation technique,
which is part of the wavelet transform, is not only different for different
blocks but also different at different levels of wavelet decomposition.
Appendices C.1 and C.2 give the wavelet transform schemes for the
three-processor and two-processor setups respectively, with the Master
Processor being used for calculating the Mean and Variance.
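How the block sizes shrink from level to level in the three-processor scheme
can be traced with a small calculation. The Python sketch below is
illustrative only: the function decomposition_counts is a hypothetical helper,
while the block sizes (86, 85, 85), the filter length of 2 and the even/odd
decimation schedule are the ones listed in Appendix C.1.

```python
def decomposition_counts(block_len, parities, filter_len=2):
    """Coefficient counts after each decomposition level for one block.

    Filtering is a full convolution (n + filter_len - 1 outputs), followed
    by dyadic decimation that keeps the even- or the odd-indexed outputs.
    """
    counts = []
    n = block_len
    for parity in parities:
        filtered = n + filter_len - 1
        if parity == "even":
            n = (filtered + 1) // 2   # indices 0, 2, 4, ...
        else:
            n = filtered // 2         # indices 1, 3, 5, ...
        counts.append(n)
    return counts


# Master processor block, then the two slave processor blocks.
print(decomposition_counts(86, ["even", "even", "even"]))  # [44, 23, 12]
print(decomposition_counts(85, ["even", "odd", "even"]))   # [43, 22, 12]
print(decomposition_counts(85, ["odd", "even", "odd"]))    # [43, 22, 11]
```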


4.2 Data Base: An Off-Line Process 

Data base preparation is an off-line process. In the data base we store the
relevant features of the images which are expected during on-line
identification. The result of identification depends on the similarity
measure values of the on-line incoming images with respect to the data base
images. The minimum distance value is used to classify the images. The data
base feature vector is formed in the same way as the query image feature
vector. So, for the data base features we have calculated the histograms of
the sample images of all the available defective and non-defective classes.
These histograms are decomposed using the Haar wavelet up to three levels
of decomposition. The Mean and Variance of the wavelet coefficients at all
three levels of decomposition are used as features. The members of the
feature vector are as follows:

{Var. of histogram, Mean of detail coeff. at level 1, Var. of detail coeff.
at level 1, Mean of detail coeff. at level 2, Var. of detail coeff. at
level 2, Mean of detail coeff. at level 3, Var. of detail coeff. at level 3,
Var. of coarse coeff. at level 3}.

Figure 4.1: Flow Chart of Implementation Scheme 1

Figure 4.2: Flow Chart of Implementation Scheme 2

The data base feature vector of a particular class is calculated as the
average of the feature vectors of the sample images of that class. The
feature vectors for all the classes of images are stored permanently in
data base memory locations and compared for similarity with the query
images.
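The averaging of sample feature vectors and the minimum-distance comparison
can be sketched as follows. This Python fragment is an illustration under
stated assumptions: the function names are hypothetical, the rms distance is
taken to be the square root of the mean squared difference between vector
elements, and the two-element vectors are made-up numbers, not measured
features.

```python
import math


def class_prototype(sample_vectors):
    """Average the feature vectors of all sample images of one class."""
    n = len(sample_vectors)
    return [sum(column) / n for column in zip(*sample_vectors)]


def rms_distance(u, v):
    """Root mean square difference between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))


def classify(query, database):
    """Return the class label whose stored vector is closest to the query."""
    return min(database, key=lambda label: rms_distance(query, database[label]))


# Made-up feature vectors, for illustration only.
database = {
    "Non-defective": class_prototype([[1.0, 2.0], [3.0, 2.0]]),   # [2.0, 2.0]
    "Coil Break": class_prototype([[8.0, 9.0], [10.0, 9.0]]),     # [9.0, 9.0]
}
print(classify([2.5, 1.5], database))   # Non-defective
```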


4.3 System Evaluation Using Steel Slab Images

Figs. 4.3 and 4.4 show images (480 x 360) of steel surfaces having different
types of defects. The image histogram is shown to the right of each image.
Intra-class variations of the histograms of some of the commonly occurring
classes of images are shown in Figs. 4.5, 4.6, 4.7, 4.8 and 4.9. These
images were grabbed on-line with the help of an area scan camera as the
steel sheet moved on the conveyor belt. The feature vectors of the grabbed
images are formed on-line as the steel sheet passes through the field of
view of the area scan camera, which is mounted in a direction transverse to
the sheet motion. To extract the features we followed the same technique
that was used for the data base, but this time the processing is on-line:
the grabbed images are digitized to obtain the histogram, which is then
decomposed to three levels using the wavelet transform. The coefficients of
the decompositions are used to calculate the mean and variance. These
features are arranged in the same order as in the stored data base feature
vector.
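The feature extraction chain described above (histogram, three-level Haar
decomposition, mean and variance) can be sketched in Python. This is a
minimal illustration and not the thesis implementation: the function names
are assumptions, and the Haar step is written in its non-overlapping pairwise
form with orthonormal scaling, whereas the thesis implements the filters as a
convolution followed by dyadic decimation.

```python
import math


def histogram(pixels, bins=256):
    """Gray-level histogram of a sequence of integer pixel values."""
    h = [0] * bins
    for p in pixels:
        h[p] += 1
    return h


def haar_step(signal):
    """One Haar level: scaled pairwise sums (coarse) and differences (detail)."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    coarse = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return coarse, detail


def mean(x):
    return sum(x) / len(x)


def variance(x):
    m = mean(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def feature_vector(pixels, levels=3):
    """Eight features in the order used for the data base feature vector."""
    coarse = histogram(pixels)
    features = [variance(coarse)]            # Var. of histogram
    for _ in range(levels):
        coarse, detail = haar_step(coarse)
        features += [mean(detail), variance(detail)]
    features.append(variance(coarse))        # Var. of coarse coeff. at level 3
    return features


# A synthetic "image" with every gray level equally frequent.
pixels = [i % 256 for i in range(1024)]
print(len(feature_vector(pixels)))   # 8
```

The ordering of the returned list follows the feature vector members listed
in Section 4.2, so a query vector built this way can be compared element by
element with a stored class vector.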

To determine the processing speed, the number of processors required and
the implementation technique best suited to the available hardware, various
hardware and software configurations were tested. Table 4.1 shows the
relative comparison of the number of processors, the algorithm implemented
and the processing speed for a high speed processor setup. From Table 4.1
it is clear that as the number of processors increases, the identification
time of the system decreases. In a parallel processor environment,
interprocessor communication plays a major role, and sometimes it even
becomes the major bottleneck in improving the performance of the system. As
the number of processors increases, the communication overhead also
increases. We also note from Tables 4.1 and 4.2 that, as the number of
processors increases, the saving in identification time is not in
proportion to the increase in the number of processors. This saving depends
on the communication involved and also on the implementation scheme.
Implementation scheme II proves to be better with respect to identification
time. We used this scheme for further evaluation of system performance.

Fig. (c) Black Patch Steel Slab Defective Image and its Histogram

Fig. (c) Scratch Steel Slab Defective Image and its Histogram

Figure 4.8: Intra-class Histogram Variation of Linear Scratch Images

Table 4.1: Processor and Processing Speed: An Experimental Observation

No. of        Time (ms) required for       Time (ms) required for
processors    implementation scheme I      implementation scheme II
1             329                          329
2             221                          218
3                                          117

Table 4.2: Processors and Percentage Saving in Time

Implementation scheme    % Saving 2:1    % Saving 3:1    % Saving 3:2
I                        32.82           60.48           41.17
II                       33.73           64.43           46.33
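The percentage savings can be related to the measured identification times
with a one-line formula. The Python sketch below assumes that the saving is
defined as the relative reduction in identification time,
100 x (t_before - t_after) / t_before; this definition is an inference, and
the function name percent_saving is hypothetical. With the scheme II times of
Table 4.1 it reproduces the scheme II row of Table 4.2 to within one unit in
the last digit.

```python
def percent_saving(t_before, t_after):
    """Relative reduction in identification time, in percent."""
    return 100.0 * (t_before - t_after) / t_before


# Identification times (ms) for scheme II with 1, 2 and 3 processors.
t1, t2, t3 = 329, 218, 117
print(round(percent_saving(t1, t2), 2))   # 33.74
print(round(percent_saving(t1, t3), 2))   # 64.44
print(round(percent_saving(t2, t3), 2))   # 46.33

# Ideal saving with 3 processors if there were no communication overhead.
print(round(100.0 * (1 - 1 / 3), 2))      # 66.67
```

The gap between the measured saving and the ideal figure illustrates the
communication overhead discussed above.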

Although the number of defect types occurring in the steel industry is
large, a sufficient number of samples of the other defective classes was
not available, so we have evaluated the performance of the system with five
defective classes of images only, together with the non-defective class.
They are as follows:

Class A: Non-defective
Class B: Coil Break
Class C: Black Patch
Class D: Scratch
Class E: Pinch
Class F: Roll Mark


In the laboratory evaluation of this system we have simulated the on-line
industrial inspection process by applying already stored image frames
(defective as well as non-defective, some of which are used only for
testing the system). The laboratory evaluation of the system gives 100%
correct identification. The distance metric results for some samples of the
defective and non-defective classes are given in Table 4.3.


4.4 System Evaluation Using Textured and
Non-Textured Images


To test the hardware architecture and the identification algorithm, we have
also tested this system with some general textured and non-textured images.
These images are classified as follows:

Class A: Brodatz Texture
Class B: Garden Images
Class C: Miss America
Class D: Salesman
Class E: Coil Break
Class F: Black Patch
Class G: Scratch
Class H: Pinch
Class I: Claire
Class J: Calendar
Class K: Roll Mark


The images are of size (360 x 288). Fig. 4.10 shows typical samples of
images of the new classes. A new data base is prepared by extracting the
features as explained in the earlier section, and these feature vectors are
stored in the data base memory locations. To test the identification
characteristics of the system, some sample images which were not used for
data base preparation are applied as if they were grabbed on-line. The
similarity metrics for all these samples are calculated in real time. The
results of these measures are shown in Table 4.4. The identification time
required for these images (360 x 288) using implementation scheme II with
the three TMS320C40 DSP processor setup is 73 msec.

Table 4.3: Results of System Evaluation Using Steel Surface Defective
Images

Fig. 4.10 panels: (a) Claire Image, (b) Calendar Image, (c) Miss America,
(d) Garden Image, (e) Salesman Image

The results in Table 4.4 show that the proposed system can also be
used for identification of general textured and non-textured images.

The basic feature extraction depends on the histogram of an image, and the
histogram is highly dependent on the illumination level. If the
illumination level is uniform but not constant, the complete histogram
shifts drastically in intensity space, and in feature space the feature
vector of that sample image deviates considerably from its class feature
vector, which in turn gives a wrong identification result. In the
laboratory evaluation we increased/decreased the brightness and/or contrast
of images using an image manipulation program (gimp).

Table 4.4: Result of System Evaluation Using Textured and Non-Textured
Images


Chapter 5

Conclusions and Future
Extensions


In this thesis work we have proposed an implementation of a real time
identification system using high speed parallel DSP processors. The
application program of the software module is mainly written in the 'C'
language, except for the communication and initialization modules, which
are written in assembly language.


5.1 Conclusions 

• Features are extracted from the intensity histogram at various
resolutions. This not only reduces the computational complexity but has
also proved to be a robust method for environments where image
translations and rotations cannot be avoided.

• Since the system works in real time, the identification time should
be well within the specification. It is experimentally measured that the
proposed system will be able to identify images of size 1024 x 1024 in
less than 1 sec. The results of the laboratory evaluation of this system
with textured as well as non-textured images are found to be acceptable.

• As the histogram is invariant to translation, rotation etc., the result
of identification does not depend on these parameters. However, the
histogram is very sensitive to illumination, and maintaining a uniform
and constant illumination is a crucial requirement.

• The results of the laboratory evaluation show that the system can be
used for identification of faces, finger prints etc., for which a data
base is prepared and feature vectors are stored. We can also think of
using this system as a smart sensor for a smart robot. The system has a
wide range of applications, and one can think of using it as an
intelligent system in a well disciplined environment.

• The performance of the system in terms of processing speed would be
better if the PCI bus architecture were used in place of the ISA bus
architecture for communication between the frame grabber card and the
PC (Host).


5.2 Scope for Future Works 

• Once the system performance and algorithm are found to be satisfactory
in industrial environments, the processing speed can be improved by
converting the 'C' routines to optimized assembly language and finally
putting the overall code in EEPROM, so that when the system is switched
on it starts its specified work without a user request.

• This system does not specify the exact positions of defects in a
particular defective image frame. It only identifies the whole frame as
defective or non-defective. With the help of block histograms (spatial
segmentation of the image frame and the histograms of the segments) at
various resolutions, we can specify the positions of defects exactly in
that image frame.

• The present system has not been tested for more than one defect
occurring in a particular image frame. With the help of spatial
segmentation and block histograms at various resolutions, we can find
the feature vectors for all the blocks separately and store them in the
data base for similarity measurements. By segmenting the on-line grabbed
images in the same way as the data base is prepared, we can identify
more than one defect in the grabbed on-line images.

• The performance of the system will be better if we can extract the
features from the color histogram at various resolutions instead of
extracting the features from intensity histograms. This requires a color
camera with a color image grabber card.

• Processing time can be improved by decomposing the identification
technique into two levels. A defect detection algorithm can be
implemented at the first level. Only those images in which defects are
present are then sent to the classification module.

• In place of software routines for defect detection, we can use smart
sensors for defect detection followed by software routines for
classification. This improves the system performance drastically because
the hardware is always faster, but it reduces the portability of the
system.




Appendix


Appendix A.1

1. The approximation operator $A_{2^j}$ is a linear operator. If
$A_{2^j} f(x)$ is the approximation of some signal $f(x)$
($f(x) \in L^2(R)$) at resolution $2^j$, then $A_{2^j} f(x)$ is not
modified if we approximate it once again at the same resolution $2^j$,
i.e. $A_{2^j} \circ A_{2^j} = A_{2^j}$. Thus $A_{2^j}$ is a projection
operator on a particular vector space $V_{2^j} \subset L^2(R)$. The
vector space $V_{2^j}$ can be interpreted as the set of all possible
approximations at the resolution $2^j$ of functions in $L^2(R)$.

2. Among all the approximated functions at resolution $2^j$,
$A_{2^j} f(x)$ is the most similar to $f(x)$, i.e.
$\forall g(x) \in V_{2^j}, \; \|g(x) - f(x)\| \ge \|A_{2^j} f(x) - f(x)\|$.
Hence, the operator $A_{2^j}$ is an orthogonal projection on the vector
space $V_{2^j}$.

3. The approximation of a signal at resolution $2^{j+1}$ contains all the
necessary information to compute the same signal at resolution $2^j$.
This is a causality property:
$\forall j \in Z, \; V_{2^j} \subset V_{2^{j+1}}$.

4. An approximation operator is similar at all resolutions. The spaces
of approximated functions should thus be derived from one another by
scaling each approximated function by the ratio of their resolution
values:
$\forall j \in Z, \; f(x) \in V_{2^j} \Leftrightarrow f(2x) \in V_{2^{j+1}}$.

5. The approximation $A_{2^j} f(x)$ of a signal $f(x)$ can be
characterized by $2^j$ samples per unit length. When $f(x)$ is
translated by a length proportional to $2^{-j}$, $A_{2^j} f(x)$ is
translated by the same amount and is characterized by the same samples,
which have been translated.

6. When computing an approximation of $f(x)$ at resolution $2^j$, some
information about $f(x)$ is lost. However, as the resolution increases
to $+\infty$, the approximated signal should converge to the original
signal. Conversely, as the resolution decreases to zero, the
approximated signal contains less and less information and converges to
zero.

Appendix A.2


This appendix describes the algorithm for computing the fast discrete
wavelet transform. Let G and H be the decomposition low pass and high pass
filters respectively. Let $S_{2^j} f$ be the digitized image at resolution
j. We denote by (A * B) the convolution of the discrete signals A and B. At
each scale $2^j$, the algorithm decomposes the discrete signal $S_{2^j} f$
into a coarse signal $S_{2^{j+1}} f$ and a detail signal $W_{2^{j+1}} f$.


j = 0

while (j < J)        ; J: maximum number of levels of decomposition

    $W_{2^{j+1}} f = S_{2^j} f * H$

    $S_{2^{j+1}} f = S_{2^j} f * G$

    j = j + 1

end of while


Appendix B.1


DMA channel arbitration in split mode is described in the figure shown
below.

CACK: Channel Acknowledge (active low)
CSTRB: Channel Strobe (active low)
CRDY: Channel Ready (active low)
CD(7-0): Channel data lines

As shown in the figure, there is only one temporary register in each
channel. Therefore, a primary channel operation must complete before an
auxiliary channel operation can begin, and vice versa.


Appendix C.1


For implementation of the wavelet transform on three processors with a
256-bin histogram, the histogram is first divided into three blocks, and
the number of coefficients in each block is given by

86 + 85 + 85 = 256 (total)

The first block (86 coefficients) stays with the master processor
(TMS320C40 of the grabber card), and the next two blocks (85 coefficients
each) are sent to the two slave processors (TMS320C40 of the DSP board).
Since we used the Haar wavelet, the number of Haar coefficients is M = 2.
The wavelet transform implementation technique is explained below.

First Level of Decomposition:

For first block (Master Processor, TMS320C40 of Grabber Card):

Number of coefficients after the high pass and low pass decomposition
filtering (the filter is implemented as convolution) = 86 + 2 - 1 = 87.

Number of coefficients after dyadic decimation (even) = 44, all the even
coefficients are taken.

For second block (TMS320C40-1 of PPC):

Number of coefficients after high pass and low pass decomposition filtering
(filter implemented as convolution) = 85 + 2 - 1 = 86.

Number of coefficients after dyadic decimation (even) = 43, all even
coefficients are taken.

For third block (TMS320C40-2 of PPC):

Number of coefficients after high pass and low pass decomposition filtering
(filter implemented as convolution) = 85 + 2 - 1 = 86.

Number of coefficients after dyadic decimation (odd) = 43, all odd
coefficients are taken.

These intermediate coefficients are stored for further decomposition
and are sent to the master processor only after they have been decomposed
up to the final level.

Second Level of Decomposition:

For first block ( Master processor ): 

Number of coefficients after high pass and low pass decomposition filtering 
= 44+2-1=45. 

Number of coefficients after dyadic decimation ( even )= 23, all even coeffi- 
ci(uit.s arc; taken. 

For second block ( TMS320C40-1 ). 

Number of coefficients after high pass and low pass decomposition filtering 
= 43+2-1=44. 

Number of coefficients after dyadic decimation ( odd ) = 22, all odd coeffi- 
cients are taken. 

For third block ( TMS320C40-2 ): 

Number of coefficients after high pass and low pass decomposition filtering 
= 43+2-1=44. 

Number of coefficients after dyadic decimation (even )= 22, all even coeffi- 
cients are taken. 

Third Level of Decomposition : 

For first block ( Master Processor )• 

Number of coefficients after high pass and low pass decomposition filtering 
= 23 +2-1=24. 

Number of coefficients after dyadic decimation ( even )= 12, all even coeffi- 
cicuits are taken 

For second block ( TMS320C40-1 ): 

Number of coefficients after high pass and low pass decomposition filtering 
= 22 +2-1= 23. 

Number of coefficients after dyadic decimation ( even )= 12, all even coeffi- 
cients are taken. 

For third block ( TMS320C40-2 ): 

Number of coefficients after high pass and low pass decomposition filtering 
= 22+2-1=23. 

Number of coefficients after dyadic decimation ( odd )= 11, all odd 
coefficients are taken. 
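
The per-block counts listed for the three levels follow from two rules used in the text: filtering by convolution lengthens a block of N coefficients to N+2-1, and dyadic decimation keeps either the even or the odd positions. The short Python sketch below reproduces those counts. It is only an illustration: the function names are hypothetical, the two-tap filter length is inferred from the "+2-1" in the text, and positions are assumed to be counted from zero.

```python
def conv_length(n, filter_len=2):
    """Length of a full linear convolution of n samples with a filter_len-tap filter."""
    return n + filter_len - 1


def decimated_length(n, parity):
    """Samples kept when only the even or only the odd positions (0-based) are retained."""
    return (n + 1) // 2 if parity == "even" else n // 2


def block_sizes(n, parities, filter_len=2):
    """Coefficient count of one block after each level of filtering and decimation."""
    sizes = []
    for parity in parities:
        n = decimated_length(conv_length(n, filter_len), parity)
        sizes.append(n)
    return sizes


# Block lengths (86, 85, 85) and the even/odd choices are those quoted in the text.
print(block_sizes(86, ["even", "even", "even"]))  # master processor: [44, 23, 12]
print(block_sizes(85, ["even", "odd", "even"]))   # TMS320C40-1:      [43, 22, 12]
print(block_sizes(85, ["odd", "even", "odd"]))    # TMS320C40-2:      [43, 22, 11]
```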


After each block has been decomposed separately by the three processors, 
the coefficients are sent to the master processor for calculation of the final 
coefficients and for feature extraction. The technique employed to get the 
final coefficients at all three levels is shown below. 

First level final coefficients ( total number of coefficients = 129 ) : 

The 44th coefficient of the 1st block and the 1st coefficient of the 2nd block 
are overlapped and added, i.e., 

1st block: 0, 1, 2, 3, ..., 42, 43 

2nd block: 0, 1, 2, ..., 40, 41, 42 

3rd block: 0, 1, 2, ..., 40, 41, 42 

Second level final coefficients ( total number of coefficients = 65 ) : 

To get the final second level decomposition coefficients, the 23rd coefficient 
of the 1st block and the 1st coefficient of the 2nd block, and the 22nd 
coefficient of the 2nd block and the 1st coefficient of the 3rd block, are 
overlapped and added, i.e., 

1st block: 0, 1, 2, ..., 20, 21, 22 
2nd block: 0, 1, 2, ..., 20, 21 
3rd block: 0, 1, 2, ..., 20, 21 

Third level final coefficients ( total number of coefficients = 33 ) : 

To get the final third level coefficients, the 12th coefficient of the 1st 
block and the 1st coefficient of the 2nd block, and the 12th coefficient of 
the 2nd block and the 1st coefficient of the 3rd block, are overlapped and 
added, i.e., 

1st block: 0, 1, 2, ..., 9, 10, 11 
2nd block: 0, 1, 2, ..., 9, 10, 11 
3rd block: 0, 1, 2, ..., 9, 10 
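
The overlap-and-add step can be checked numerically. The sketch below is a minimal illustration under stated assumptions, not the thesis code: the histogram is synthetic, the two-tap filter `h = [1, 1]` is a stand-in (the actual filter taps are not given in this section), only one filter branch is cascaded, and all function names are hypothetical. Each block carries the position (`offset`) of its first coefficient in the merged vector, and decimation keeps the samples whose position in the merged vector is even, which gives the even/odd choices listed for each block.

```python
def convolve(x, h):
    """Full linear convolution; the output has len(x) + len(h) - 1 samples."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def decompose_block(coeffs, offset, h, levels):
    """Filter and decimate one block, level by level, without merging.

    Returns one (offset, coefficients) pair per level, where offset is the
    position of the block's first coefficient in the merged vector.
    """
    out = []
    for _ in range(levels):
        y = convolve(coeffs, h)
        first = offset % 2          # 0: keep even local samples, 1: keep odd ones
        coeffs = y[first::2]
        offset = (offset + first) // 2
        out.append((offset, coeffs))
    return out


def overlap_add(pieces, total):
    """Add every block's coefficients into a vector of the given length."""
    merged = [0] * total
    for offset, coeffs in pieces:
        for k, c in enumerate(coeffs):
            merged[offset + k] += c
    return merged


def decompose_direct(x, h, levels):
    """Reference: decompose the whole vector on a single processor."""
    out = []
    for _ in range(levels):
        x = convolve(x, h)[0::2]
        out.append(x)
    return out


hist = [(7 * i * i + 3 * i) % 11 for i in range(256)]   # synthetic 256-bin histogram
h = [1, 1]                                              # assumed two-tap filter
blocks = [(0, hist[:86]), (86, hist[86:171]), (171, hist[171:])]  # 86, 85, 85 bins

per_block = [decompose_block(c, off, h, 3) for off, c in blocks]
direct = decompose_direct(hist, h, 3)
merged_levels = [overlap_add([pb[lvl] for pb in per_block], total)
                 for lvl, total in enumerate([129, 65, 33])]

for merged, ref in zip(merged_levels, direct):
    print(len(merged), merged == ref)   # 129 True, then 65 True, then 33 True
```

Under these assumptions the merged vectors agree with the single-processor decomposition because filtering is linear: the partial sums held by neighbouring blocks at a shared position add up to the true coefficient.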
Appendix C.2 


In this section we explain in detail the wavelet transform implementation of 
a 256-bin histogram on two processors. The histogram vector is divided into 
two blocks of equal size, i.e., each block contains 128 coefficients. 
Three levels of decomposition are performed on each block. The interesting 
part of this implementation scheme is that the final coefficients are not 
calculated at every level of decomposition; instead, the coefficients of all 
the levels are sent in parallel to the master processor for feature extraction. 
The implementation scheme is explained below. 

First Level of Decomposition : 

First block ( TMS320C40-1 ) : 

Number of coefficients after high pass and low pass decomposition filtering 
= 128+2-1=129. 

Number of coefficients after dyadic decimation ( even ) = 65, all even 
coefficients are taken. 

Second block ( TMS320C40-2 ) : 

Number of coefficients after high pass and low pass decomposition filtering 
= 128+2-1=129. 

Number of coefficients after dyadic decimation ( even ) = 65. 

Second Level of Decomposition : 

For the first block ( TMS320C40-1 ) : 

Number of coefficients after high pass and low pass decomposition filtering 
= 65+2-1=66. 

Number of coefficients after dyadic decimation ( even ) = 33. 

For the second block ( TMS320C40-2 ) : 

Number of coefficients after high pass and low pass decomposition filtering 
= 65+2-1=66. 

Number of coefficients after dyadic decimation ( odd ) = 33. 

Third Level of Decomposition : 

First block ( TMS320C40-1 ) : 

Number of coefficients after high pass and low pass decomposition filtering 
= 33+2-1= 34. 

Number of coefficients after dyadic decimation ( even ) = 17. 

Second block ( TMS320C40-2 ) : 

Number of coefficients after high pass and low pass decomposition filtering 
= 33+2-1=34. 

Number of coefficients after dyadic decimation ( even ) = 17. 

These coefficients are overlapped and added to get the final coefficient 
values at every level of decomposition, and finally the features are extracted 
from these coefficients. The overlap and add scheme for this implementation 
technique is shown below. 

First Level Decomposition Coefficients ( 129 coefficients ) : 

1st block: 0, 1, 2, ..., 62, 63, 64 

2nd block: 0, 1, 2, ..., 62, 63, 64 

Second Level Decomposition Coefficients ( 65 coefficients ) : 

1st block: 0, 1, 2, ..., 30, 31, 32 

2nd block: 0, 1, 2, ..., 30, 31, 32 

Third Level Decomposition Coefficients ( 33 coefficients ) : 

1st block: 0, 1, 2, ..., 14, 15, 16 

2nd block: 0, 1, 2, ..., 14, 15, 16 
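
As a quick consistency check of the two-processor figures, the sketch below recomputes the per-block counts and the merged lengths, and compares them with the lengths obtained when the whole 256-bin vector is decomposed at once. It assumes a two-tap filter (inferred from the "+2-1" in the text) and that the two blocks share exactly one coefficient position at every level; the function names are hypothetical.

```python
def level_lengths(n, levels, filter_len=2):
    """Coefficient count after each level: full convolution, then keep every other sample."""
    out = []
    for _ in range(levels):
        # Convolution gives n + filter_len - 1 samples; keeping the even
        # positions (0-based) retains the ceiling of half of them.
        n = (n + filter_len - 1 + 1) // 2
        out.append(n)
    return out


def merged_lengths(block_len, n_blocks, levels, shared=1):
    """Length of the overlap-added vector when adjacent blocks share `shared` positions."""
    per_block = level_lengths(block_len, levels)
    return [n_blocks * n - shared * (n_blocks - 1) for n in per_block]


print(level_lengths(128, 3))       # per block:     [65, 33, 17]
print(merged_lengths(128, 2, 3))   # after merging: [129, 65, 33]
print(level_lengths(256, 3))       # whole vector:  [129, 65, 33]
```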